Saint Louis County
Reconstruction-Driven Multimodal Representation Learning for Automated Media Understanding
Benhammou, Yassir, Kalyan, Suman, Kumar, Sujay
Broadcast and media organizations increasingly rely on artificial intelligence to automate the labor-intensive processes of content indexing, tagging, and metadata generation. However, existing AI systems typically operate on a single modality, such as video, audio, or text, limiting their understanding of complex, cross-modal relationships in broadcast material. In this work, we propose a Multimodal Autoencoder (MMAE) that learns unified representations across text, audio, and visual data, enabling end-to-end automation of metadata extraction and semantic clustering. The model is trained on the recently introduced LUMA dataset, a fully aligned benchmark of multimodal triplets representative of real-world media content. By minimizing joint reconstruction losses across modalities, the MMAE discovers modality-invariant semantic structures without relying on large paired or contrastive datasets. We demonstrate significant improvements in clustering and alignment metrics (Silhouette, ARI, NMI) compared to linear baselines, indicating that reconstruction-based multimodal embeddings can serve as a foundation for scalable metadata generation and cross-modal retrieval in broadcast archives. These results highlight the potential of reconstruction-driven multimodal learning to enhance automation, searchability, and content management efficiency in modern broadcast workflows.
- Information Technology > Information Management (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
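The "joint reconstruction losses across modalities" mentioned in the MMAE abstract above can be illustrated with a minimal sketch. The abstract does not give the exact loss, so this assumes a weighted sum of per-modality mean squared errors; the function names, the `weights` parameter, and the toy vectors are hypothetical and not taken from the paper.

```python
from typing import Dict, List, Optional

Vector = List[float]


def mse(x: Vector, x_hat: Vector) -> float:
    """Mean squared error between an input vector and its reconstruction."""
    if len(x) != len(x_hat):
        raise ValueError("input and reconstruction must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def joint_reconstruction_loss(
    inputs: Dict[str, Vector],
    reconstructions: Dict[str, Vector],
    weights: Optional[Dict[str, float]] = None,
) -> float:
    """Assumed form: weighted sum of per-modality reconstruction errors.

    Modalities without an explicit weight default to 1.0.
    """
    total = 0.0
    for name, x in inputs.items():
        w = 1.0 if weights is None else weights.get(name, 1.0)
        total += w * mse(x, reconstructions[name])
    return total


# Toy example with two modalities (values are made up for illustration).
inputs = {"text": [1.0, 2.0], "audio": [0.0]}
recons = {"text": [1.0, 4.0], "audio": [1.0]}
loss = joint_reconstruction_loss(inputs, recons)
```

In a real model the reconstructions would come from decoders that share one latent code, which is what pushes that code toward modality-invariant structure; the sketch only shows how the per-modality terms combine.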
"Monuments," Reviewed: The Confederacy Surrenders to a Truer American Past
As the Trump Administration tries to rescue symbols of the Lost Cause, an exhibition in Los Angeles, led by Kara Walker, finds meaning in their desecration. Kara Walker's "Unmanned Drone" (2023) transforms a Stonewall Jackson statue. The first thing you see is a horse's ass, protruding, upside down, from the thorax of a monster. A man's arm descends from the beast's stomach, his gloved hand clutching the blade of a fallen sabre. Every part of the work comes from a statue of the Confederate general Stonewall Jackson that was removed from Charlottesville, Virginia, in 2021.
- North America > United States > California > Los Angeles County > Los Angeles (0.25)
- North America > United States > Virginia > Albemarle County > Charlottesville (0.24)
- North America > United States > New York (0.06)
- (9 more...)
Podcasts as a Medium for Participation in Collective Action: A Case Study of Black Lives Matter
Moldovan, Theodora, Pera, Arianna, Vega, Davide, Aiello, Luca Maria
We study how participation in collective action is articulated in podcast discussions, using the Black Lives Matter (BLM) movement as a case study. While research on collective action discourse has primarily focused on text-based content, this study takes a first step toward analyzing audio formats by using podcast transcripts. Using the Structured Podcast Research Corpus (SPoRC), we investigated spoken language expressions of participation in collective action, categorized as problem-solution, call-to-action, intention, and execution. We identified podcast episodes discussing racial justice after important BLM-related events in May and June of 2020, and extracted participatory statements using a layered framework adapted from prior work on social media. We examined the emotional dimensions of these statements, detecting eight key emotions and their association with varying stages of activism. We found that emotional profiles vary by stage, with different positive emotions standing out during calls-to-action, intention, and execution. We detected negative associations between collective action and negative emotions, contrary to theoretical expectations. Our work contributes to a better understanding of how activism is expressed in spoken digital discourse and how emotional framing may depend on the format of the discussion.
- North America > United States > California (0.14)
- Europe > Austria > Vienna (0.14)
- Europe > Sweden > Uppsala County > Uppsala (0.04)
- (11 more...)
- Law > Civil Rights & Constitutional Law (1.00)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
- Law > Criminal Law (0.93)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.34)
Towards Unraveling and Improving Generalization in World Models
Fang, Qiaoyi, Du, Weiyu, Wang, Hang, Zhang, Junshan
World models have recently emerged as a promising approach to reinforcement learning (RL), achieving state-of-the-art performance across a wide range of visual control tasks. This work aims to obtain a deep understanding of the robustness and generalization capabilities of world models. Thus motivated, we develop a stochastic differential equation formulation by treating the world model learning as a stochastic dynamical system, and characterize the impact of latent representation errors on robustness and generalization, for both cases with zero-drift representation errors and with non-zero-drift representation errors. Our somewhat surprising findings, based on both theoretic and experimental studies, reveal that for the case with zero drift, modest latent representation errors can in fact function as implicit regularization and hence result in improved robustness. We further propose a Jacobian regularization scheme to mitigate the compounding error propagation effects of non-zero drift, thereby enhancing training stability and robustness. Our experimental studies corroborate that this regularization approach not only stabilizes training but also accelerates convergence and improves accuracy of long-horizon prediction.
- North America > United States > California > Yolo County > Davis (0.04)
- North America > United States > South Carolina > Charleston County > North Charleston (0.04)
- North America > United States > South Carolina > Charleston County > Charleston (0.04)
- (2 more...)
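The Jacobian regularization mentioned in the world-models abstract above can be sketched generically. The abstract does not specify the scheme, so this assumes a common form: a penalty on the squared Frobenius norm of the Jacobian of a latent transition function, estimated here by forward finite differences. The names `jacobian_frobenius_sq`, `regularized_loss`, and the weight `lam` are hypothetical.

```python
from typing import Callable, List

Vector = List[float]


def jacobian_frobenius_sq(
    f: Callable[[Vector], Vector], z: Vector, eps: float = 1e-6
) -> float:
    """Squared Frobenius norm of df/dz at z, via forward finite differences."""
    base = f(z)
    total = 0.0
    for j in range(len(z)):
        shifted = list(z)
        shifted[j] += eps  # perturb one latent coordinate at a time
        out = f(shifted)
        for i in range(len(base)):
            total += ((out[i] - base[i]) / eps) ** 2
    return total


def regularized_loss(
    prediction_loss: float,
    f: Callable[[Vector], Vector],
    z: Vector,
    lam: float = 0.01,
) -> float:
    """Assumed objective: prediction loss plus a weighted Jacobian penalty."""
    return prediction_loss + lam * jacobian_frobenius_sq(f, z)


# Toy linear transition whose Jacobian is diag(2, 3), so the penalty is 4 + 9.
def transition(z: Vector) -> Vector:
    return [2.0 * z[0], 3.0 * z[1]]


penalty = jacobian_frobenius_sq(transition, [0.5, -1.0])
```

A small Jacobian norm limits how much a latent error can grow in one step, which is the intuition behind using such a penalty against compounding errors in long-horizon rollouts.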
The Download: Chinese LLMs, and transforming heavy-duty trucking
When police departments first started buying and deploying bodycams in the wake of the police killing of Michael Brown in Ferguson, Missouri, a decade ago, activists hoped it would bring about real change. Years later, despite what's become a multibillion-dollar market for these devices, the tech is far from a panacea. Most footage they generate goes unwatched. And if they do finally provide video to the public, it usually doesn't tell the complete story. A handful of AI startups see this problem as an opportunity to create what are essentially bodycam-to-text programs for different players in the legal system, mining this footage for misdeeds. But like the bodycams themselves, the technology still faces procedural, legal, and cultural barriers to success.
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
- Law (0.88)
The Download: the problem with police bodycams, and how to make useful robots
When police departments first started buying and deploying bodycams in the wake of the police killing of Michael Brown in Ferguson, Missouri, a decade ago, activists hoped it would bring about real change. Years later, despite what's become a multibillion-dollar market for these devices, the tech is far from a panacea. Most of the vast reams of footage they generate go unwatched. And if they do finally provide video to the public, it's often selectively edited, lacking context and failing to tell the complete story. A handful of AI startups see this problem as an opportunity to create what are essentially bodycam-to-text programs for different players in the legal system, mining this footage for misdeeds.
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.99)
- Law (0.65)
Rank-without-GPT: Building GPT-Independent Listwise Rerankers on Open-Source Large Language Models
Zhang, Xinyu, Hofstätter, Sebastian, Lewis, Patrick, Tang, Raphael, Lin, Jimmy
Listwise rerankers based on large language models (LLMs) are the zero-shot state-of-the-art. However, current works in this direction all depend on the GPT models, making it a single point of failure in scientific reproducibility. Moreover, it raises the concern that the current research findings only hold for GPT models but not LLMs in general. In this work, we lift this pre-condition and build for the first time effective listwise rerankers without any form of dependency on GPT. Our passage retrieval experiments show that our best listwise reranker surpasses the listwise rerankers based on GPT-3.5 by 13% and achieves 97% effectiveness of the ones built on GPT-4. Our results also show that the existing training datasets, which were expressly constructed for pointwise ranking, are insufficient for building such listwise rerankers. Instead, high-quality listwise ranking data is required and crucial, calling for further work on building human-annotated listwise data resources.
- Asia > Mongolia (0.14)
- Europe > Norway (0.04)
- Europe > France > Île-de-France > Val-d'Oise > Roissy (0.04)
- (15 more...)
Generate rather than Retrieve: Large Language Models are Strong Context Generators
Yu, Wenhao, Iter, Dan, Wang, Shuohang, Xu, Yichong, Ju, Mingxuan, Sanyal, Soumya, Zhu, Chenguang, Zeng, Michael, Jiang, Meng
Knowledge-intensive tasks, such as open-domain question answering (QA), require access to a large amount of world or domain knowledge. A common approach for knowledge-intensive tasks is to employ a retrieve-then-read pipeline that first retrieves a handful of relevant contextual documents from an external corpus such as Wikipedia and then predicts an answer conditioned on the retrieved documents. In this paper, we present a novel perspective for solving knowledge-intensive tasks by replacing document retrievers with large language model generators. We call our method generate-then-read (GenRead), which first prompts a large language model to generate contextual documents based on a given question, and then reads the generated documents to produce the final answer. Furthermore, we propose a novel clustering-based prompting method that selects distinct prompts, so that the generated documents cover different perspectives, leading to better recall over acceptable answers. We conduct extensive experiments on three different knowledge-intensive tasks, including open-domain QA, fact checking, and dialogue systems. Notably, GenRead achieves 71.6 and 54.4 exact match scores on TriviaQA and WebQ, significantly outperforming the state-of-the-art retrieve-then-read pipeline DPR-FiD by +4.0 and +3.9, without retrieving any documents from any external knowledge source. Lastly, we demonstrate the model performance can be further improved by combining retrieval and generation. Our code and generated documents can be found at https://github.com/wyu97/GenRead.
- North America > United States > California (0.14)
- Asia > Middle East > Iran (0.14)
- Asia > Middle East > Jordan (0.05)
- (20 more...)
- Media (1.00)
- Government (1.00)
- Food & Agriculture > Agriculture > Pest Control (0.68)
- (3 more...)
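The generate-then-read flow described in the GenRead abstract above can be outlined as a minimal sketch. This is not the authors' implementation (their code is at the URL given in the abstract); the function `generate_then_read`, the injected `generate` and `read` callables, and the prompt templates are hypothetical stand-ins, and the clustering-based prompt selection is assumed to have already produced the list of prompts.

```python
from typing import Callable, List


def generate_then_read(
    question: str,
    prompts: List[str],
    generate: Callable[[str], str],
    read: Callable[[str, List[str]], str],
) -> str:
    """Generate one contextual document per prompt, then read them all.

    `generate` stands in for a call to a large language model and `read`
    for a reader model; both are supplied by the caller.
    """
    documents = [generate(p.format(question=question)) for p in prompts]
    return read(question, documents)


# Stub models so the sketch runs without any external service.
def stub_generate(prompt: str) -> str:
    return "doc for: " + prompt


def stub_read(question: str, documents: List[str]) -> str:
    return f"{len(documents)} docs read for '{question}'"


answer = generate_then_read(
    "Who wrote Hamlet?",
    ["Background on: {question}", "Key facts about: {question}"],
    stub_generate,
    stub_read,
)
```

Passing the models in as callables keeps the pipeline shape visible: the retriever of a retrieve-then-read system is replaced by the `generate` step, while the reader stays the same.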
Should Local Police Departments Deploy Lethal Robots?
Last month, the San Francisco Board of Supervisors voted in favor of allowing that city's police department to deploy robots equipped with a potential to kill, should a situation--in the estimation of police officers--call for lethal force. With that decision, the board appeared to have delivered the city to a dystopian future. The vote garnered a loudly negative response from the public, and this week the supervisors reversed course and sent the policy back to committee. But the fact that the decision initially passed--and may yet pass in some form--should not have been surprising. Police departments around the country have been acquiring robotic devices for decades.
- North America > United States > California > San Francisco County > San Francisco (0.28)
- North America > United States > Nevada > Clark County > Las Vegas (0.05)
- North America > United States > Missouri > Saint Louis County > Ferguson (0.05)
- (3 more...)
San Francisco will allow police to deploy robots that kill
Supervisors in San Francisco voted Tuesday to give city police the ability to use potentially lethal, remote-controlled robots in emergency situations -- following an emotionally charged debate that reflected divisions on the politically liberal board over support for law enforcement. The vote was 8-3, with the majority agreeing to grant police the option despite strong objections from civil liberties and other police oversight groups. Opponents said the authority would lead to the further militarization of a police force already too aggressive with poor and minority communities. Supervisor Connie Chan, a member of the committee that forwarded the proposal to the full board, said she understood concerns over use of force but that "according to state law, we are required to approve the use of these equipments. So here we are, and it's definitely not a easy discussion."
- North America > United States > California > San Francisco County > San Francisco (0.71)
- Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.05)
- North America > United States > Missouri > Saint Louis County > Ferguson (0.05)
- North America > United States > California > Alameda County > Oakland (0.05)